The Effect of Database Size Distribution on Resource Selection Algorithms

نویسندگان

  • Luo Si
  • James P. Callan
چکیده

Resource selection is an important topic in distributed information retrieval research. It can be a component of a distributed information retrieval task and can also serve as an independent application of database recommendation system together with the resource representation part. There is a large body of valuable prior research on resource selection but very little has studied about the effects of different database size distributions on resource selection. In this paper, we propose extended versions of two well-known resource selection algorithms: CORI and KL divergence in order to consider the factors of database size distributions, and compare them with the lately proposed Relevant Document Distribution Estimation (ReDDE) resource selection algorithm. Experiments were done on four testbeds with different characteristics, and the ReDDE and the extended KL divergence resource selection algorithm have been shown to be more robust in various environments.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A Differential Evolution and Spatial Distribution based Local Search for Training Fuzzy Wavelet Neural Network

Abstract   Many parameter-tuning algorithms have been proposed for training Fuzzy Wavelet Neural Networks (FWNNs). Absence of appropriate structure, convergence to local optima and low speed in learning algorithms are deficiencies of FWNNs in previous studies. In this paper, a Memetic Algorithm (MA) is introduced to train FWNN for addressing aforementioned learning lacks. Differential Evolution...

متن کامل

Using Data Mining and Three Decision Tree Algorithms to Optimize the Repair and Maintenance Process

The purpose of this research is to predict the failure of devices using a data mining tool. For this purpose, at the outset, an appropriate database consists of 392 records of ongoing failures in a pharmaceutical company in 1394, in the next step, by analyzing 9 characteristics and type of failure as a database class, analyzes have been used. In this regard, three decision tree algorithms have ...

متن کامل

Distributed Information Retrieval With Skewed Database Size Distributions

The proliferation of government information on local area networks and the Internet creates the problem of finding information that may be distributed among many disjoint text databases (distributed information retrieval or federated search). A distributed information retrieval system is composed of three components: Resource representation, resource selection and result merging. Previous resea...

متن کامل

Introducing the Iranian moss flora explorer

Mosses (a section of bryophytes), are considered as an important group of non-flowering plants which a modern computer-assisted database system is not yet prepared for their determination and description key in Iran. Following a software package recently designed under a research project for assessment of moss diversity of Iran by the authors, the present paper is prepared to: I) introducing co...

متن کامل

Improving of Feature Selection in Speech Emotion Recognition Based-on Hybrid Evolutionary Algorithms

One of the important issues in speech emotion recognizing is selecting of appropriate feature sets in order to improve the detection rate and classification accuracy. In last studies researchers tried to select the appropriate features for classification by using the selecting and reducing the space of features methods, such as the Fisher and PCA. In this research, a hybrid evolutionary algorit...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2003